Crime is declining. But residents don't feel safer. The perception gap — the distance between what the statistics say and what people experience on the street — is invisible to current measurement systems. The City Survey captures it once every two years. Public Safety Pulse would capture it every day, at block level.
"Safety isn't just a statistic; it's a feeling you hold when you're walking down the street."
Three components that create a powerful feedback loop:
The 2023 City Survey (most recent) shows perception at its lowest point since 1996. But the 2025 CityBeat poll shows improvement among frequent downtown visitors. Without Public Safety Pulse, we can't see how this varies by block, time of day, or in response to interventions.
Every point on this map is a real 311 report, SFPD incident, or traffic crash — weighted by its impact on perception. Encampments and violent crime produce stronger heat signals than graffiti or property offenses. Press play to watch hotspots migrate through a typical day. Neighborhood labels show SPI scores for context.
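The per-report weighting described above can be sketched as a simple lookup. The category names and weight values below are illustrative assumptions, not the dashboard's calibrated weights; they only show the mechanism by which an encampment report contributes more heat than a graffiti report.

```python
# Hypothetical salience weights (illustrative values only): categories
# assumed to loom larger in perception contribute more heat per report.
SALIENCE = {
    "encampment": 1.0,
    "violent_crime": 1.0,
    "property_crime": 0.5,
    "street_cleaning": 0.4,
    "graffiti": 0.3,
}

def heat_weight(category: str, default: float = 0.5) -> float:
    """Perception weight for a single 311 report, SFPD incident, or crash."""
    return SALIENCE.get(category, default)

# Four reports on one block: 1.0 + 0.3 + 0.3 + 1.0 heat units total.
reports = ["encampment", "graffiti", "graffiti", "violent_crime"]
total_heat = sum(heat_weight(c) for c in reports)
```

A real implementation would map each dataset's category taxonomy onto a shared salience scale before summing per map cell.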
With Phase 1 data flowing, the dashboard would generate automatic alerts when perception drops below thresholds:
SPI fuses 7 data signals into a single 0–100 score per neighborhood. Higher = safer feeling. This is a proxy estimate from publicly available data. Phase 1 would validate and calibrate these scores against direct perception measurements.
Positive = more disorder than crime (perception problem). Negative = more crime than disorder (hidden risk).
Darker cells = more reports per month. This is the variation a biennial survey cannot capture.
Once Phase 1 provides direct perception data, the correlation engine identifies which observable factors actually predict how safe people feel. This replaces assumed weights with empirical ones.
Feed all available data into a regression model with direct sentiment as the dependent variable. The model reveals which levers — cleaning, lighting, ambassador presence, encampment density — most strongly correlate with perception. Then test interventions and measure impact.
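A minimal sketch of that calibration step, using synthetic data: the four lever names and coefficient values below are assumptions for illustration, and the fit is plain ordinary least squares via NumPy rather than whatever model Phase 1 would actually select.

```python
import numpy as np

# Synthetic stand-in for Phase 1 data: four observable levers per
# neighborhood-day, plus a directly measured sentiment score.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 4))  # cleaning, lighting, ambassadors, encampments
true_beta = np.array([0.2, 0.5, 0.3, -0.8])  # assumed ground truth
sentiment = X @ true_beta + rng.normal(scale=0.1, size=n)

# OLS: the fitted coefficients rank which levers move perception most.
beta, *_ = np.linalg.lstsq(np.c_[np.ones(n), X], sentiment, rcond=None)
levers = dict(zip(
    ["intercept", "cleaning", "lighting", "ambassadors", "encampments"],
    beta,
))
```

With real data, the coefficient magnitudes (and their uncertainty) would prioritize which interventions to trial first.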
Based on current data: neighborhoods with higher values of these indicators tend to have lower SPI scores. These are associations, not causal findings — Phase 1 enables causal testing via interventions.
Likely to directly measure or predict sentiment:
Known correlates from urban research:
| Source | Records | SPI Role | Weight |
|---|---|---|---|
| 311 Requests vw6y-z8j6 | 626,911 | Disorder density + salience + temporal + resolution | 25% + 13% + 13% + 15% |
| SFPD Incidents wg3w-h783 | 85,804 | Crime severity density | 18% |
| Traffic Crashes ubvf-ztfx | 2,467 | Pedestrian safety | 8% |
| Reddit r/sanfrancisco | 319 | Community sentiment baseline | 8% |
SPI = 100 − scaled(25%×D + 18%×C + 13%×DC + 8%×PS + 13%×TR + 8%×CS + 15%×RR)
D = z_score( Σ(311_cases × e^(-days/180)) / area_km² )
C = z_score( Σ(SFPD × severity_weight × e^(-days/180)) / area_km² )
DC = z_score( mean_salience_weight per neighborhood )
PS = z_score( crashes / area_km² )
TR = z_score( 0.6 × night_ratio + 0.4 × trend_ratio )
CS = keyword_sentiment from Reddit (neighborhood-specific where possible)
RR = z_score( median_311_resolution_days )
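The fusion above can be sketched directly from the stated weights. One simplification in this sketch: the published formula takes CS from keyword sentiment rather than a z-score, but here all seven signals are treated uniformly as z-scores, and "scaled" is interpreted as min–max scaling to 0–100 before inversion; both are assumptions about the intended computation.

```python
import numpy as np

WEIGHTS = {"D": 0.25, "C": 0.18, "DC": 0.13, "PS": 0.08,
           "TR": 0.13, "CS": 0.08, "RR": 0.15}

def z(x):
    x = np.asarray(x, float)
    return (x - x.mean()) / x.std()

def decayed_density(days_ago, area_km2):
    """Recency-weighted report density: each report counts e^(-days/180)."""
    return np.exp(-np.asarray(days_ago, float) / 180).sum() / area_km2

def spi(signals):
    """signals: dict of raw per-neighborhood arrays keyed like WEIGHTS.
    Higher raw values mean more burden (disorder, crime, slow resolution),
    so the weighted sum is inverted: higher SPI = safer feeling."""
    burden = sum(w * z(signals[k]) for k, w in WEIGHTS.items())
    scaled = 100 * (burden - burden.min()) / (burden.max() - burden.min())
    return 100 - scaled
```

On any input, the highest-burden neighborhood scores 0 and the lowest-burden one scores 100; scores are relative across neighborhoods, which is why Phase 1 calibration against direct measurements matters.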
| Dataset | Access | Signal | Priority |
|---|---|---|---|
| BART Station Exits | bart.gov (manual download) | Foot traffic avoidance — declining exits = people avoiding the area | High |
| Muni Ridership | sfmta.com | Transit confidence indicator | Medium |
| Yelp/Google Reviews | Yelp Fusion API (free) | Direct safety sentiment from real visitors at specific locations | High |
| Business Vacancy | CBRE/Cushman & Wakefield quarterly | Empty storefronts signal neighborhood decline | Medium |
| Streetlight Outages | Already in 311 data (needs extraction) | Lighting is among the strongest single perception predictors | High |
| Google Trends | Free API | Search volume for "[neighborhood] + safety" keywords | Medium |
| City Survey Microdata | Controller's Office request | Enables regression-based weight calibration | Critical |
| Replica/SafeGraph | Commercial (MIT partnership) | Mobility patterns — where people avoid walking | High |
| Pedestrian Counts | SFMTA automated counters | Foot traffic decline = avoidance behavior | Medium |
| Nextdoor Posts | Partnership needed | Hyperlocal community safety discussion | Medium |
Current approach: Weighted z-score fusion with assumed weights from literature.
Phase 1 upgrade: Principal Component Analysis to discover which signals co-vary with direct perception → Bayesian hierarchical regression for weight calibration → spatial autocorrelation (Moran's I) to account for neighboring-block influence → temporal autoregression for trend prediction.
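The spatial-autocorrelation step can be illustrated with the standard global Moran's I statistic; the chain-adjacency example below is a toy setup, not the project's actual block topology.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I. x: per-block values; w: spatial weight matrix
    with w[i, j] > 0 when blocks i and j are neighbors, zero diagonal.
    Positive values mean similar scores cluster among neighboring blocks."""
    x = np.asarray(x, float)
    z = x - x.mean()
    num = (w * np.outer(z, z)).sum()
    den = (z * z).sum()
    return len(x) / w.sum() * num / den

# Toy example: six blocks in a row, each bordering the next.
n = 6
w = np.zeros((n, n))
for i in range(n - 1):
    w[i, i + 1] = w[i + 1, i] = 1.0
x = [1, 1, 1, 5, 5, 5]        # clustered pattern
print(morans_i(x, w))          # → 0.6 (positive autocorrelation)
```

A significantly positive Moran's I on SPI residuals would justify borrowing strength from neighboring blocks in the hierarchical model rather than treating each block independently.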
Research basis: Wilson & Kelling (1982) Broken Windows; Sampson & Raudenbush (1999) Disorder Observation; Salesses et al. (2013) MIT Place Pulse; Naik et al. (2014) MIT Streetscore; Welsh & Farrington (2008) Lighting & Crime.
Reporting bias: 311 reflects who reports. Engaged neighborhoods over-report.
Survival bias: Areas people avoid generate fewer data points.
No direct perception: Everything here is inferred from proxy data.
Reddit: Only 319 posts, not geocoded.
Weights: Not empirically calibrated. Phase 1 fixes this.
311 captures what people report. Crime data captures what police file. Areas people avoid appear safe. We need the actual signal — how people feel, in the moment, at the places they actually are.
"Right now, how does the surrounding area feel to you?" — Comfortable / Neutral / Uncomfortable. Through existing digital touchpoints. Anonymous. Aggregated by place and time.
City Science Lab San Francisco × MIT Media Lab City Science